Solving the class imbalance problem using a counterfactual method for data augmentation

Abstract

Learning from class imbalanced datasets poses challenges for many machine learning algorithms. Many real-world domains are, by definition, class imbalanced by virtue of having a majority class that naturally has many more instances than its minority class (e.g., genuine bank transactions occur much more often than fraudulent ones). Many methods have been proposed to solve the class imbalance problem, among the most popular being oversampling techniques (such as SMOTE). These methods generate synthetic instances in the minority class to balance the dataset, performing data augmentations that improve the performance of predictive machine learning (ML) models. In this paper, we advance a novel data augmentation method (adapted from eXplainable AI) that generates synthetic, counterfactual instances in the minority class. Unlike other oversampling techniques, this method adaptively combines existing instances from the dataset, using actual feature-values rather than interpolating values between instances. Several experiments using four different classifiers and 25 datasets involving binary classes are reported, which show that this Counterfactual Augmentation method (CFA) generates useful synthetic datapoints in the minority class. The experiments also show that CFA is competitive with many other oversampling methods, many of which are variants of SMOTE. The basis for CFA’s performance is discussed, along with the conditions under which it is likely to perform better or worse in future tests.
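The contrast the abstract draws between interpolation and reuse of actual feature-values can be illustrated with a small sketch. This is not the paper's CFA algorithm: the function names `smote_like` and `counterfactual_like`, the Euclidean nearest-neighbour pairing, and the `tol` threshold are all assumptions made here for illustration only.

```python
import numpy as np


def smote_like(minority, rng):
    """SMOTE-style sketch: interpolate between two distinct minority instances."""
    i, j = rng.choice(len(minority), size=2, replace=False)
    lam = rng.random()  # interpolation weight in [0, 1)
    return minority[i] + lam * (minority[j] - minority[i])


def counterfactual_like(query, majority, minority, tol=1e-9):
    """Counterfactual-style sketch (hypothetical simplification, not the paper's CFA).

    Pair the majority instance nearest to `query` with its nearest minority
    instance, then copy the minority partner's actual values into `query`
    for every feature on which that pair differs.
    """
    base = majority[np.argmin(np.linalg.norm(majority - query, axis=1))]
    partner = minority[np.argmin(np.linalg.norm(minority - base, axis=1))]
    differs = np.abs(base - partner) > tol  # features that separate the pair
    synthetic = query.copy()
    synthetic[differs] = partner[differs]   # reuse real values, no interpolation
    return synthetic
```

In this sketch the interpolated point generally takes feature values that no real instance has, whereas every value changed by `counterfactual_like` is copied from an existing minority instance.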

Similar resources

the algorithm for solving the inverse numerical range problem

The numerical range of a square matrix A is denoted by W(A) and defined as W(A) = {x*Ax : x ∈ S1}, where S1 is the unit sphere. In 2009, Russell Carden posed the inverse numerical range problem as follows: for a point z ∈ W(A), find a vector x ∈ S1 such that z = x*Ax. In this thesis, we present an algorithm for solving the inverse numerical range problem.

A regularization method for solving a nonlinear backward inverse heat conduction problem using discrete mollification method

The present essay scrutinizes the application of discrete mollification as a filtering procedure to solve a nonlinear backward inverse heat conduction problem in one-dimensional space. These problems are seriously ill-posed. So, we combine discrete mollification and the space marching method to address the ill-posedness of the proposed problem. Moreover, a proof of stability and...

Class Imbalance Problem in Data Mining using Probabilistic Approach

The class imbalance problem arises when one class has many more examples than the other classes. Classical classifiers designed for balanced datasets cannot deal with the class imbalance problem because they pay more attention to the majority class. The main drawback associated with this focus on the majority class is the loss of important information. The class imbalance problem is difficult due to the amount ...

Class Imbalance Problem in Data Mining Review

In the last few years, the classification of data has undergone major changes and evolution. As the application areas of technology increase, the size of data also increases. Classification of data becomes difficult because of the unbounded size and imbalanced nature of data. The class imbalance problem has become one of the greatest issues in data mining. The imbalance problem occurs where one of the two classes havi...

A numerical technique for solving a class of 2D variational problems using Legendre spectral method

An effective numerical method based on Legendre polynomials is proposed for the solution of a class of variational problems with suitable boundary conditions. The Ritz spectral method is used for finding the approximate solution of the problem. By utilizing the Ritz method, the given nonlinear variational problem reduces to the problem of solving a system of algebraic equations. The advantage o...

Journal

Journal title: Machine Learning with Applications

Year: 2022

ISSN: 2666-8270

DOI: https://doi.org/10.1016/j.mlwa.2022.100375